ABSTRACT

A multi-node network manifests a changing topology of
individual data processing nodes. An originating node
transmits an initial identifier (ID) message over each
interconnect link that emanates from itself. A receiver in the
originating node receives an acknowledgement (ACK) message
from each neighbor node, each ACK message including a
neighbor node's link identifier for the link over which the ACK
message was transmitted. The originating node constructs
and stores a topology table entry which includes data from
received ACK messages. Each entry includes a node
identifier, an originating node link identifier and a neighbor
node identifier from which an ACK message was received
and a neighbor node link identifier for the link. An update
procedure causes the transmitter to transmit to all other
nodes, the originating node entry and further causes the
topology table entries received from other nodes to be
entered in the topology table of the originating node, so that
all nodes in the system are enabled to thereafter identify the
topology of the system.

10 Claims, 4 Drawing Sheets 
SYSTEM AND METHOD FOR
DETERMINING THE TOPOLOGY OF A 
RECONFIGURABLE MULTI-NODAL 
NETWORK 

FIELD OF THE INVENTION

This invention relates to reconfigurable interconnection
networks in multi-node data processing systems and, more
particularly, to a system and method for enabling each node
to rapidly determine the topology of the multi-node network.

BACKGROUND OF THE INVENTION 

In a reconfigurable multi-node network, maintaining a
consistent view of network topology in each node requires
solution of a number of problems. For instance, many nodes
do not have direct access to disk storage wherein topology
information can be stored. In any event, each node should be
able to discover the network topology at initialization time
without any preexisting information regarding the existing
topology. Further, the topology observed by each node
should be able to be updated when a topology change occurs
as a result of a failure of a node or link or the addition or
subtraction of a node or a link. The solution to these
problems is difficult, especially when network topology is
large and dynamic and, further, when communication overhead
within the multi-node network must be minimized. The prior
art has suggested various methods for enabling nodes in a
multi-node network to maintain knowledge of the network's
topology.

U.S. Pat. No. 4,912,656 to Cain et al. describes a
satellite-based multi-node network wherein independent nodes
determine what connections they can make to "improve network
productivity" and the identified connections are then
broadcast to all nodes in the network. Each node receiving the
proposed connectivity changes then resolves conflicts and
determines what change should be made in the individual
nodes.

U.S. Pat. No. 4,914,471 to Baratz et al. describes a
multi-node system wherein a requesting node first determines
which resources reside within itself and then, if
desired resources are not found, searches resources (known
to a server node) which reside elsewhere in the network. The
search then continues (if the requested resource is not found)
to all associated nodes, etc. U.S. Pat. No. 4,987,536 to
Humblet provides a system for determining a shortest path
from a starting node to a destination node. The system
enables each node to form a routing tree and to communicate
that routing tree to each adjacent node. Modifications to the
routing tree are then made in accordance with information
from adjacent nodes.

U.S. Pat. No. 4,995,035 of Cole et al. partitions a network
into focal point nodes and non-focal point nodes and enables
each focal point node to maintain a sphere-of-control table
which lists the non-focal point nodes served by itself. U.S.
Pat. No. 5,049,873 of Robins et al. describes a system for
gathering status information regarding a communications
network wherein a monitor node maintains topology data for
the network. A method for updating the topology information
is described.

U.S. Pat. No. 5,051,987 to Conlin discloses a multi-node
network wherein each node accesses information relating to
a then current topology of a network and transmits a
message through appropriate connecting links. During an
update state, each node interrogates each neighboring node
regarding nodes which neighbor it. The process continues
until all nodes have returned information regarding their
neighboring nodes, thus enabling each node's topology to
be updated. Coan et al. in U.S. Pat. No. 5,093,824 describes
a multi-node network wherein each node stores a precomputed
configuration table that corresponds to each of a
plurality of possible network topologies which can result
from a number of possible failure events. When such a
failure event occurs, the precomputed configuration
corresponding to the failure event is accessed and used for
inter-node communication control.

U.S. Pat. No. 5,130,974 to Kawamura et al. describes a
data communication network wherein nodes are interconnected
by regular and spare routes. In the event of a line
fault, request signals are transmitted to obtain permission to
establish a new regular route to an adjacent node. The
control network is then dynamically reconfigured in accordance
with the presence or absence of request and grant
signals. Other routing algorithms can be found in "New
Routing Algorithms for Large Interconnected Networks",
Bar-Noy et al., IBM Technical Disclosure Bulletin, Volume
35 No. 11992 pages 436, 437 and in published Japanese
application 04-207239 of Masatoshi et al.

In general, the above noted prior art requires that each
node have at least some information regarding immediate
neighbor nodes and, in certain instances, requires a
pre-loading of initial topology information from a centralized
source. Further, considerable inter-node communications are
required, especially when large tree configurations are
transferred from node to node during the topology update
process. This problem becomes especially important when
dealing with large multi-node networks comprising hundreds
and possibly thousands of interconnected nodes.

Accordingly, it is an object of this invention to provide a 
system and method for determining the topology of a 
multi-node network wherein message traffic is minimized. 

It is another object of this invention to provide a system
and method for determination of multi-node network topology
wherein a centralized source of topology information is
avoided.

It is yet another object of this invention to provide an
improved method for enabling each node of a multi-node
system to determine the system's network topology without
requiring a pre-loading of information regarding network
topology.

It is yet another object of this invention to provide a
method and system which enables each node of a multi-node
network to establish a local network topology table, and
minimizes message sizes to implement such method.

SUMMARY OF THE INVENTION 

A multi-node network manifests a dynamic and changing
topology of individual data processing nodes. Each node is
connectable to plural other nodes via full duplex interconnect
links. Each node includes a processor, memory and
programming means for enabling discovery of the
reconfigurable topology. An originating node includes a
transmitter which transmits an initial identifier (ID) message
over each interconnect link that emanates from itself. Each ID
message includes an originating node link identifier for the
link over which the ID message is transmitted. Logic in the
originating node provides a "time out" signal at an expiration
of a time interval. A receiver in the originating node receives
an acknowledgement (ACK) message from each neighbor
node, each ACK message including a neighbor node's link
identifier for the link over which the ACK message was
transmitted back to the originating node. The originating
node constructs and stores a topology table which includes
data from received ACK messages. The topology table
includes originating node and neighbor node entries for each
neighbor node from which an ACK message is received.
Each entry includes a node identifier, an originating node
link identifier and a neighbor node identifier from which an
ACK message was received and a neighbor node link
identifier for the link. The protocol associates a null value
with any link over which no ACK message is received prior
to the first time out signal. An update procedure causes the
transmitter to transmit to all other nodes, the originating
node entry and further causes the topology table entries
received from other nodes to be entered in the topology table
of the originating node, so that all nodes in the system are
enabled to thereafter identify the topology of the system.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a block diagram of an exemplary multi-node
data processing system.

FIG. 2 is a block diagram of components of an exemplary
data processing node shown in the multi-node system of
FIG. 1.

FIG. 3 is a topology table which is constructed and
maintained in each of the nodes of FIG. 1.

FIG. 4 is a state diagram which illustrates the various
states and procedures which occur that enable the
construction and updating of the topology table of FIG. 3.

DETAILED DESCRIPTION OF THE INVENTION

The protocol to be hereafter described enables nodes in a
multi-node network to identify the network topology without
previous information as to the topology. The protocol is
preferably implemented by a software established state
machine in each node in the network. The protocol
essentially comprises three phases:

1. Self-Positioning: wherein each node determines what
connections it has to its immediate neighbor nodes, if
any. Based on message exchanges during this phase, a
node constructs an entry about itself in an otherwise
empty topology table within the node. A first time-out
is used to determine when this phase ends. The first
time-out begins when a node transmits initial greeting
messages over its data links. This phase transits either
to a Connecting state, where the node finds out about
other nodes on the network, or, if no neighbor nodes are
heard from, to a Stable state as a "network" of one
node.

2. Connecting: when the node tells other nodes in the
network about itself and, conversely, finds out what
other nodes are in the network. This is accomplished by
each node sending to "All nodes" the topology about
itself (i.e., an Update message). This also includes
iterative forwarding of each node's topology "row"
throughout the network. Redundant traffic is avoided
by each interim node forwarding the information only
when it differs from the content of its topology table. If
the contents match (meaning the message has been seen
before), the message is discarded. The topology
information for a given node comes only from the node
itself. This eliminates the possibility of information
being out of synchronism which, in turn, can result in
excess message traffic. A second time-out is used to
determine when this phase ends. At the end of this
phase, the node is in the Stable state and the node's
topology table contains an entry for each node in the
network. Further, each node's topology table is
identical.

3. Adjusting: occurs when changes (addition, deletion,
failure) are detected about nodes or links between
nodes. This is reported by the node(s) detecting the
change and is iteratively forwarded through the
network, as described above. The second time-out is
used to determine if the network is back in a stable
state. When the node transits to the Stable state, the
topology table in each node contains the same updates.

Referring to FIG. 1, an exemplary nine node data
processing network 10 is illustrated that comprises nodes
N1-N9 and implements the topology protocol described above.
Each node includes four full duplex communication links
that are schematically shown in FIG. 1 as links 12, 14, 16
and 18 for node N1. Further, it is noted that node N1 denotes
each of those respective links as link ports a, b, c, and d. By
contrast, node N4 denotes link 18 as link port "b" and
node N2 denotes link 12 as link port "c". Thus, each node
has its own designator for a link, which may or may not be
the same as that used by the node connected at the other end
of the link.

Hereafter, a node connected one link away from another
node will be termed a "neighbor" node. As will become
hereafter apparent, each node has a unique identifier (e.g.,
N1-N9) and each node is able to both transmit and receive
messages over each of links 12, 14, 16 and 18. Further, each
node includes a timer facility for generating time out events.

Turning to FIG. 2, each of nodes N1-N9 is preferably
configured from a common node arrangement of modules.
Each node includes a node processor 20 that controls the
overall functions of the node. Each node further includes a
control message facility that comprises a control memory
interface 22, a state machine 23, and a control message
memory (DRAM) 24. DRAM 24 also is employed to
maintain a topology table 26 whose functions will be
considered below.

A data message facility is further included in each node
and comprises a data buffer interface 28 and a data memory
(DRAM) 30. Data buffer interface 28 connects to a plurality
of device interfaces 32, 34, etc. which, in turn, connect to
disk drives 36, 38, etc. Control messages originating from
control memory interface 22 enable control of the various
nodal actions within the node.

The respective control and data message facilities allow
independent processing and transfer of control and data
messages to an input/output (I/O) switch 40. I/O switch 40
includes means for both receiving messages which arrive on
any of link ports a, b, c, or d and for transmitting messages
over those links to neighboring nodes. Once a message is
processed in I/O switch 40 and passed to control memory
interface 22, the identity of the link over which the message
was received is lost.

Control memory interface 22 is enabled to operate
substantially independently of node processor 20 under a
number of circumstances. One of those circumstances is during
initiation when topology table 26 is generated and stored.
Control memory interface 22 includes a state machine 23
which enables a node to independently develop an
interconnect "row" for itself in topology table 26 and to
transmit its respectively generated interconnect row to all
other nodes within network 10 (FIG. 1). Topology table 26 can
be represented as a table, but those skilled in the art will
realize that its actual data structure is not necessarily tabular
in form and that it is just necessary that it associates with
each node identifier the desired link information, so as to
enable that information to be appropriately communicated to
other nodes in the network.

FIG. 3 illustrates topology table 26 which comprises three
main columns, i.e., a node identifier (ID) column 50, a
column 52 designating neighbor nodes that are connected to
each of the identified node's links and a column 54
designating link identifiers assigned by the neighbor nodes to
each of the interconnecting links to the node ID listed in
column 50. Thus, node ID column 50 includes a row for
each of nodes N1-N9.

"Neighbor on link" column 52 comprises four
subcolumns, one for each of a node's link ports a-d. For
each node ID listed in column 50, there is an entry in column
52 which identifies the neighbor node connected to the
respectively noted link port. Thus, for node N1, node N2 is
connected to link port a. No nodes are connected to link
ports b and c, so null values are inserted. Lastly, as node N4
is connected to link port d, its identifier appears in the "d"
subcolumn.

Column 54 includes link port entries for each "link from
neighbor" designator. For the row corresponding to node
N1, link port c in node N2 is connected to link port a in node
N1. Similarly, node N4 has its link port "b" connected to the
d link port of node N1.

Each of nodes N1-N9 is initially responsible for deriving
and updating its particular row of topology table 26. Then
that row is communicated to all other nodes. Each node has
no responsibility for either deriving or updating any row of
topology table 26 other than its own designated row.

To implement the topology discovery protocol, network
10 employs three message data structures, i.e., INIT, ACK
and UPDATE. In addition, two time delays TO1 and TO2 are
used during the protocol. TO1 is set equal to the maximum
round trip delay for communications between a pair of
neighbor nodes. TO2 is set equal to the maximum round trip
delay between a pair of nodes located at the farthest ends of
network 10.

An INIT (or initiation) message includes the following
fields (not all of which are used during each message): "All
nodes": a value that represents all of the valid nodes in the
network; "Void link": indicates an invalid (i.e.,
uninitialized) link identifier; "Send node": indicates the
identifier of the node sending an INIT message; "Send link":
identifies the link port over which the INIT message was
transmitted from the send node; and "Dest node": indicates
the identifier of the node to which the INIT message is
eventually destined.

An ACK (or acknowledgement) message is a message
transmitted by a node in response to receipt of an INIT
message. The fields in the ACK message are as follows:
"Recv node": indicates the identifier of the node originating
the ACK message; "Recv link": indicates the link port
of the neighbor node over which the ACK message is being
transmitted to the node which generated the INIT message;
"Send node" and "Send link" fields: both are identical to
the Send node and Send link fields contained in the INIT
message; and a "Dest node" field: defines the ultimate node
for which the ACK message is destined.

An UPDATE message includes the same first five fields
listed above for the ACK message as well as a field which
contains the entire "topology table row" for the node
originating an UPDATE message.

The above messages are dispatched and handled upon the
occurrence of an "event". Those events are as follows:
initialization; disconnection; time out; node isolated; node
admitted; link failed and link added. Each event defines a
condition where the network is either initializing its
topology tables or a node or link has either been added, become
inoperative or disconnected. In each case, one or more
protocol messages is dispatched to accomplish an update of
the network's topology tables and state machine.

The INIT message is transmitted by each initializing node
to notify its neighbors about its existence and to ask for
admittance to the network. An ACK message is used by a
node which has received an INIT message from an
initializing neighbor node, to acknowledge and admit that
neighbor to the network. The UPDATE message is used to
ask the receiver of the message to UPDATE (including add
or delete) a row of topology information in the receiving
node's topology table.

In each message, the Send node field and the Send link
field are transmitted and identify the node and link,
respectively, from which the message was sent. The Dest node
field identifies the ultimate destination of the message. This
field is checked by each receiving node to determine
whether the message needs to be forwarded to other neighbor
nodes. To minimize message traffic (and unnecessary
message forwarding), each Dest node implements a procedure
according to the following rules:

(1) if Dest node equals Recv node, no message forwarding
is required;

(2) if Dest node equals All nodes and the topology row
field in the received message is different from the
corresponding row in the topology table maintained in
the receiving node, the message is forwarded to all
immediate neighbor nodes except the previous sender;

(3) if Dest node equals All nodes and the topology row is
the same as the corresponding row in the topology
table, the message is discarded and not further
forwarded;

(4) if Dest node is not recognizable, the message is
forwarded to the neighbor nodes as in case (2).

Protocol "events" are generated by state machine 23 or by
other processes in the node. As described above, there are
eight types of events. Each type of event defines its event
data which needs to be processed by the state machine 23.
An "initialization" event is generated when a node is to be
initialized. A "disconnection" event is generated when the
node needs to disconnect itself from the network. A "node
isolated" event is generated when the node cannot find any
communicable neighbor node and declares itself as the only
active node in the network. A "node admitted" event is
generated when the node is admitted to the network by all of
its communicating neighbors. A "time out" event is generated
by a timer when the time delay has expired. A "link
failed" or "link added" event is generated by other processes
when it is found that the status of a communication
link on a node has changed.

Before proceeding to a specific description of the state
diagram of FIG. 4 and a comprehensive listing of the various
states and their actions, a brief description of the topology
protocol will be given with respect to network 10 (FIG. 1).
As indicated above, at initialization, each node is totally
unaware of any other node in the network, but does know its
own link ports. Each node commences by sending out an
INIT message over its link ports. Any neighbor node (one
link away) which receives the INIT message provides an
ACK message back to the originating node. However,
because the neighbor node does not necessarily know over
which of its link ports the message was received (because
the identity of the link port is lost once the message is passed
into control memory interface 22), the ACK message is sent
to all nodes connected to the neighbor node with the
originating node's Send node, Send link and Dest node fields
appended. The receiver of an ACK message will update its
topology table only if the Dest node ID matches its own ID.
Other nodes disregard the ACK message as their node IDs
do not match the Dest node ID of the ACK message.
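The four forwarding rules listed above can be illustrated with a short sketch. This is only a minimal illustration under stated assumptions, not the patented implementation: the `Update` class, the `forwarding_decision` function, the `ALL_NODES` sentinel string and the dictionary layout of a topology row are all hypothetical names chosen here for readability. The sketch also assumes that a message addressed to any other specific node is forwarded in the same way as rule (4).

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

ALL_NODES = "ALL_NODES"  # assumed sentinel standing in for the "All nodes" value

# One topology row: link port -> (neighbor node ID, neighbor's link port),
# or None where no neighbor is attached to that port.
Row = Dict[str, Optional[Tuple[str, str]]]


@dataclass
class Update:
    recv_node: str  # node the message is being handed to
    dest_node: str  # ultimate destination, or ALL_NODES
    row_node: str   # node that the carried topology row describes
    row: Row


def forwarding_decision(msg: Update, table: Dict[str, Row]) -> str:
    """Classify a received message as 'deliver', 'forward' or 'discard'."""
    if msg.dest_node == msg.recv_node:
        return "deliver"  # rule (1): no forwarding is required
    if msg.dest_node == ALL_NODES:
        if table.get(msg.row_node) != msg.row:
            return "forward"  # rule (2): row differs from the stored row
        return "discard"  # rule (3): row already known, stop the flood
    # Rule (4): unrecognized destination. As an assumption of this sketch,
    # any other specific destination is forwarded the same way.
    return "forward"
```

Comparing the carried row against the stored row is what stops the flood: once every node holds the same row, every further copy of the message falls under rule (3) and is dropped.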
Once the originating node receives the ACK message
from each of its neighbor nodes, it is enabled to insert node
connection data into its row of topology table 26. If timer
TO1 expires before an ACK message is received over a link,
the originating node assumes that there is no node at the
other end of the respective link and inserts a null value into
its topology table at the appropriate link port.

At this stage, nodes N1-N9 have now accumulated
sufficient data to complete their respective row in each node's
topology table 26. Thus, node N1 has identified all of its
neighboring nodes, which node N1 link ports they are
connected to and the neighboring node's connecting link
port identifiers. Similarly, node N2 has derived its respective
topology row, as have nodes N3 . . . N9.

Each node now sends to "All nodes" its respective
topology row data for entry into all other nodes' topology
tables 26. Because each node sends only its own topology row,
a race condition is avoided that could arise were a
non-originating node to send topology information concerning
another node while a topology change was occurring at that
other node. To the extent that other nodes receive duplicate
information, that information is discarded and the row data is
updated accordingly. Each topology table can further be
updated during run time through use of the Update message.

The above operations will be understood in further detail
by referring to FIG. 4 and the listing of topology state
transitions listed below in Table 1. As shown in FIG. 4, state
machine 23 is driven by the occurrence of the events
described above.

State machine 23 has five states. "Disconnected" is a state
which indicates that a node is disconnected from the
network. In this state, a node is not allowed to communicate
with other nodes and its topology table is void. The "Self
Positioning" state means that the node is learning its
topological position from its neighbor nodes. In this state, the
node is only able to communicate with its immediate
neighbors. The "Connecting" state means that a node has
learned its topological position from its immediate neighbors,
i.e., built its own topology row, and starts to tell other nodes
about its topological position. The "Adjusting" state means
that the node is receiving topology update information from
other nodes and is merging the changes into its topology
table. The "Stable" state indicates that the node considers the
topology stable since it has not received a topology update
from any node within the time duration TO2. As will be
recalled, time duration TO2 is the round trip time for a
message from the farthest node in the network. The initial
state of state machine 23 (at power-on) is the Disconnected
state.

The operation of state machine 23 in passing through the
five states shown in FIG. 4 will be completely apparent
from the description below in Table 1 of the various state
transitions which occur in response to the specifically
indicated events.
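As a rough illustration of the self-positioning step described above, the following sketch builds a node's own topology row from the ACK messages that arrive before timer TO1 expires and leaves a null value for every link port that stayed silent. It is a sketch under stated assumptions rather than the actual implementation: the function name `build_own_row`, the tuple layout of an ACK and the use of `None` as the null value are hypothetical. The field names `nbr_node` and `rcv_link` follow the names used in Table 1.

```python
from typing import Dict, List, Optional, Tuple

LINK_PORTS = ("a", "b", "c", "d")  # the four link ports of the exemplary node

# Assumed ACK layout: (recv_link, send_node, send_link), i.e. the originating
# node's own port, then the neighbor's node ID and the neighbor's port.
Ack = Tuple[str, str, str]


def build_own_row(node_id: str, acks_before_timeout: List[Ack]) -> Dict[str, object]:
    """Build this node's topology row from ACKs received before TO1 expired."""
    # Every port starts out null; a port with no ACK simply stays null.
    nbr_node: Dict[str, Optional[str]] = {port: None for port in LINK_PORTS}
    rcv_link: Dict[str, Optional[str]] = {port: None for port in LINK_PORTS}
    for recv_link, send_node, send_link in acks_before_timeout:
        nbr_node[recv_link] = send_node   # who is at the far end of this port
        rcv_link[recv_link] = send_link   # what the neighbor calls the link
    return {"node": node_id, "nbr_node": nbr_node, "rcv_link": rcv_link}


# Node N1 of FIG. 1: N2 answers on port a (its own port c),
# N4 answers on port d (its own port b); ports b and c stay null.
row = build_own_row("N1", [("a", "N2", "c"), ("d", "N4", "b")])
print(row["nbr_node"])  # {'a': 'N2', 'b': None, 'c': None, 'd': 'N4'}
```

The resulting row matches the N1 row described for FIG. 3: neighbors N2 and N4 on ports a and d, with null entries for ports b and c.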



TABLE 1

Topology Protocol State Transitions

Current State: DISCONNECTED or SELF_POSITIONING or STABLE
Event: Initialization
Action(s):
  1) Initialize my Topology_Row as follows.
     node = my node id;
     nbr_node[.] = VOID_NODE;
     rcv_link[.] = VOID_LINK;
  2) Send INIT to every neighbor on the link from which no ACK
     has been received by the node yet.
     INIT.recv_node = ALL_NODES;
     INIT.recv_link = VOID_LINK;
  3) Start timer with time delay TO1.
Next State: SELF_POSITIONING

Current State: STABLE or CONNECTING or ADJUSTING
Event: Disconnection
Action(s):
  1) Send an UPDATE to every neighbor on the active link as
     follows.
     UPDATE.recv_node = UPDATE.dest_node = ALL_NODES;
     UPDATE.recv_link = the outgoing link id;
     UPDATE.Topology_Row.node = my node id;
     UPDATE.Topology_Row.nbr_node[.] = VOID_NODE;
     UPDATE.Topology_Row.rcv_link[.] = VOID_LINK;
  2) Void every Topology_Row in topology table.
Next State: DISCONNECTED

Current State: SELF_POSITIONING
Event: Disconnection
Action(s):
  1) Void my Topology_Row in topology table.
Next State: DISCONNECTED

Current State: SELF_POSITIONING, CONNECTING, ADJUSTING, STABLE
Event: INIT
Action(s):
  1) Send ACK to the neighbor on every possible link as
     follows.
     ACK.recv_node = INIT.send_node;
     ACK.recv_link = INIT.send_link.
Next State: SELF_POSITIONING, CONNECTING, ADJUSTING, STABLE

Current State: SELF_POSITIONING
Event: ACK
Action(s):
  1) Update my Topology_Row as follows.
     nbr_node[ACK.recv_link] = ACK.send_node;
     rcv_link[ACK.recv_link] = ACK.send_link;
  2) Restart the timer with time delay TO1.
Next State: SELF_POSITIONING

Current State: SELF_POSITIONING
Event: Time_Out
Action(s):
  1) If all the neighbors are not communicable, i.e., my
     Topology_Row is clear except the node field, generate
     Node_Isolated event.
  2) Otherwise, generate Node_Admitted event.
Next State: SELF_POSITIONING

Current State: SELF_POSITIONING
Event: Node_Isolated
Action(s): No action.
Next State: STABLE

Current State: SELF_POSITIONING
Event: Node_Admitted
Action(s):
  1) Send an UPDATE with my Topology_Row to every neighbor
     connected by an active link as follows.
     UPDATE.recv_node = nbr_node[link];
     UPDATE.recv_link = rcv_link[link];
     UPDATE.dest_node = ALL_NODES;
     UPDATE.Topology_Row = my Topology_Row;
  2) Start the timer with time delay TO2.
Next State: CONNECTING

Current State: CONNECTING or ADJUSTING or STABLE
Event: UPDATE
Action(s):
  1) Compare UPDATE.Topology_Row with the Topology_Row
     addressed by UPDATE.Topology_Row.node in topology table.
     If match and UPDATE.dest_node = ALL_NODES, discard the
     received UPDATE and skip the following steps.
  2) If UPDATE.dest_node = ALL_NODES, then forward UPDATE to
     every neighbor on the active link except the previous
     sender.
  3) If my node is specified as a neighbor on the link of the
     node in UPDATE.Topology_Row but the node is not a neighbor
     defined in my Topology_Row, update my Topology_Row as
     follows.
     nbr_node[UPDATE.Topology_Row.rcv_link[link]] =
       UPDATE.Topology_Row.node;
     rcv_link[UPDATE.Topology_Row.rcv_link[link]] = link;
  4) If my node is not specified as a neighbor of the node in
     UPDATE.Topology_Row but the node is a neighbor on a link
     in my Topology_Row, update my Topology_Row as follows.
     nbr_node[link] = VOID_NODE;
     rcv_link[link] = VOID_LINK;
  5) In case of (3) or (4), send an UPDATE of my Topology_Row
     to every neighbor on the active link as follows.
     UPDATE.recv_node = nbr_node[link];
     UPDATE.recv_link = rcv_link[link];
     UPDATE.dest_node = ALL_NODES;
     UPDATE.Topology_Row = my Topology_Row;
  6) If the Topology_Row in my topology table addressed by
     the received UPDATE.Topology_Row.node is VOID, send an
     UPDATE to the node as follows.
     UPDATE.recv_node = the received UPDATE.send_node;
     UPDATE.recv_link = the received UPDATE.send_link;
     UPDATE.send_link = the received UPDATE.recv_link;
     UPDATE.dest_node = the received UPDATE.Topology_Row.node;
  7) Copy the received UPDATE.Topology_Row to the topology
     table.
Next State: ADJUSTING

Current State: ADJUSTING
Event: ACK
Action(s):
  1) If my nbr_node[ACK.recv_link] = VOID_NODE, update my
     Topology_Row as follows:
     nbr_node[ACK.recv_link] = ACK.send_node;
     rcv_link[ACK.recv_link] = ACK.send_link;
  2) Send an UPDATE of my Topology_Row to every neighbor on
     the active link as step (5) in the above action.
  3) Restart the timer with time delay TO2.
Next State: ADJUSTING

Current State: CONNECTING or ADJUSTING
Event: Time_Out
Action(s): No action.
Next State: STABLE

Current State: CONNECTING or ADJUSTING or STABLE
Event: Link_Failed (LF)
Action(s):
  1) Void fields in the Topology_Row associated with my
     neighbor on the link LF.link as follows.
     nbr_node[rcv_link[LF.link]] = VOID_NODE;
     rcv_link[rcv_link[LF.link]] = VOID_LINK;
  2) Void fields in my Topology_Row as follows:
     nbr_node[LF.link] = VOID_NODE;
     rcv_link[LF.link] = VOID_LINK;
  3) Send an UPDATE of Topology_Row to every neighbor
     connected by the active link as follows.
     UPDATE.recv_node = neighbor's node_id;
     UPDATE.recv_link = link;
     UPDATE.dest_node = ALL_NODES;
     UPDATE.Topology_Row = my Topology_Row;
  4) Start the timer with time delay TO2.
Next State: ADJUSTING

Current State: CONNECTING or ADJUSTING or STABLE
Event: Link_Added (LA)
Action(s):
  1) If there is no active link, i.e., my Topology_Row is
     clear.
     Send INIT to the neighbor on the link LA.link as follows.
     INIT.recv_node = ALL_NODES;
     INIT.recv_link = VOID_LINK;
  2) Start the timer with time delay TO1.
Next State: ADJUSTING
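The state transitions of Table 1 lend themselves to a table-driven sketch. The example below is a partial, hypothetical illustration: it encodes only a subset of the rows (initialization, the self-positioning outcomes, UPDATE handling and the TO2 time out) as read from the Current State and Next State columns, and it omits all of the actions. The names `State`, `TRANSITIONS`, `next_state` and `run` are assumptions of this sketch, as is the choice to leave the state unchanged for any pair that is not listed.

```python
from enum import Enum, auto


class State(Enum):
    DISCONNECTED = auto()
    SELF_POSITIONING = auto()
    CONNECTING = auto()
    ADJUSTING = auto()
    STABLE = auto()


# Partial (current state, event) -> next state map; a subset of Table 1 only.
TRANSITIONS = {
    (State.DISCONNECTED, "Initialization"): State.SELF_POSITIONING,
    (State.SELF_POSITIONING, "Time_Out"): State.SELF_POSITIONING,
    (State.SELF_POSITIONING, "Node_Isolated"): State.STABLE,
    (State.SELF_POSITIONING, "Node_Admitted"): State.CONNECTING,
    (State.CONNECTING, "UPDATE"): State.ADJUSTING,
    (State.ADJUSTING, "UPDATE"): State.ADJUSTING,
    (State.STABLE, "UPDATE"): State.ADJUSTING,
    (State.CONNECTING, "Time_Out"): State.STABLE,
    (State.ADJUSTING, "Time_Out"): State.STABLE,
}


def next_state(state: State, event: str) -> State:
    # Assumption of this sketch: an unlisted pair leaves the state unchanged.
    return TRANSITIONS.get((state, event), state)


def run(events, state: State = State.DISCONNECTED) -> State:
    """Feed a sequence of event names through the partial transition map."""
    for event in events:
        state = next_state(state, event)
    return state
```

For instance, a node that initializes, times out with at least one neighbor, is admitted, receives an UPDATE and then sees TO2 expire would end in the Stable state, while a node that hears from no neighbor reaches Stable directly through the Node_Isolated event.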
It should be understood that the foregoing description is
only illustrative of the invention. Various alternatives and
modifications can be devised by those skilled in the art
without departing from the invention. Accordingly, the
present invention is intended to embrace all such
alternatives, modifications and variances which fall within
the scope of the appended claims.

We claim: 

1. A system manifesting a reconfigurable topology of
individual data processing nodes, each node connectable to
plural other nodes via full duplex interconnect links, a node
directly connected to another node via a link hereafter
referred to as a neighbor node, each node including
processor means, memory and programming means for enabling
discovery of said reconfigurable topology, an originating
node further comprising:

transmitter means for transmitting an initial identifier
message ID msg over each interconnect link emanating
from said originating node, each ID msg including an
originating node link identifier for the link over which
said ID msg is transmitted;

receiver means for receiving an acknowledgement
message ACK msg from each neighbor node, each ACK
msg including a neighbor node link identifier for the
link over which said ACK msg was transmitted to said
originating node;

first logic means for constructing and storing in said
memory, a topology table to include data from received
ACK msgs, said topology table including an originating
node entry, said originating node entry including an
originating node identifier, an originating node link
identifier associated with a neighbor node identifier
from which an Ack msg is received, and a neighbor
node link identifier for said link that is identified by said
originating node link identifier; and

update means for causing said transmitter means to
transmit to all other nodes, said originating node entry and
for causing topology table entries received from other
nodes to be entered in said topology table of said
originating node, said originating node retransmitting a
received topology table entry from another node to a
further node only if said received topology table entry
differs from a corresponding entry in the topology table
of said originating node, whereby all nodes in said
system are enabled to thereafter identify the topology
of said system.

2. The system as recited in claim 1, further comprising:
means in said first logic means for providing a link
timeout signal at an expiration of a first time interval,
and for associating a null value in said topology table
for any link over which no ACK msg is received prior
to said link timeout signal.

3. The system as recited in claim 2, wherein said link
timeout signal is issued upon expiration of a time period for
said ACK msg to be received from a neighbor node in
response to an ID msg.

4. The system as recited in claim 2, wherein each ACK
msg includes a copy of at least a portion of said ID msg from
said originating node to enable said originating node to
determine over which link said ACK msg was received.

5. The system as recited in claim 1, wherein said update
means is further responsive to an event in said system which
causes a topology of nodes connected to links of said
originating node to change, to cause ID msgs to be
transmitted by said transmitter means and to cause retransmission
of said originating node entry to all other nodes after making
any changes to said originating node entry as a result of
received ACK msgs, if any.

6. A method for enabling individual data processing nodes
in a multi-node data processing system to derive and update
a topology for said system, without requiring initial
information regarding said topology, a node directly connected to
another node via a communication link hereafter referred to
as a neighbor node, each node including processor means,
memory and programming means for enabling discovery of
said topology, said method comprising:

transmitting an initial identifier message ID msg over
each interconnect link emanating from an originating
node, each ID msg including originating node link
identifier for the link over which said ID msg is
transmitted;

receiving an acknowledgement message ACK msg from
each connected and operable neighbor node, each ACK
msg including a neighbor node link identifier for the
link over which said ACK msg was transmitted to said
originating node;

constructing and storing in said memory, a topology table
to include data from received ACK msgs, said topology
table including an originating node entry and at least a
neighbor node entry for each operable neighbor node
from which an ACK msg is received, each entry
including a node identifier, an originating node link
identifier associated with a neighbor node identifier
from which an Ack msg is received via said link that is
identified by said originating node link identifier, and a
neighbor node link identifier for said link that is
identified by said originating node link identifier; and

transmitting to all other nodes, said originating node entry
and causing topology table entries received from other
nodes to be entered in said topology table of said
originating node, said originating node retransmitting a
received topology table entry from another node to a
further node only if said received topology table entry
differs from a corresponding entry in the topology table
of said originating node, whereby nodes in said system
are enabled to thereafter identify the topology of said
system.

7. The method as recited in claim 6, further comprising
the steps of:

providing a link timeout signal at an expiration of a first
time interval; and

associating a null value in said topology table for any link
over which no ACK msg is received prior to said link
timeout signal.

8. The method as recited in claim 7, wherein said link
timeout signal is provided upon expiration of a time period
for an ACK msg to be received from a neighbor node in
response to an ID msg.

9. The method as recited in claim 7, wherein each ACK 
msg includes a copy of at least a portion of said ID msg from 
said originating node to enable said originating node to 
determine over which link said ACK msg was received. 

10. The method as recited in claim 6, comprising the
further steps of:

responding to an event in said system which causes a
topology of nodes connected to links of said originating
node to change, by causing ID msgs to again be
transmitted; and

retransmitting said originating node entry to all other
nodes after making any changes to said originating
node entry as a result of received ACK msgs, if any.
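
By way of illustration only, and without limiting the claims, two behaviors recited above may be sketched in Python: associating a null value with any link on which no ACK msg arrives before the link timeout signal, and retransmitting a received topology table entry only if it differs from the entry already stored. The function names, the dictionary layout of an entry, the use of `None` as the null value, and the choice not to forward back over the arrival link are all assumptions of this sketch and are not taken from the claims.

```python
def apply_link_timeout(entry, links, acked_links):
    """Null out every link that produced no ACK before the timeout.

    entry maps a local link id to (neighbor node id, neighbor link id).
    """
    for link in links:
        if link not in acked_links:
            entry[link] = None  # assumed null value
    return entry


def receive_entry(table, node_id, entry, arrival_link, out_links):
    """Store a received entry and return the links to retransmit it on.

    An entry equal to the stored one is dropped, so the flood of entries
    dies out once every node holds the same table.
    """
    if table.get(node_id) == entry:
        return []                      # no difference: do not retransmit
    table[node_id] = dict(entry)       # enter the entry in the local table
    # Assumption: skip the link the entry arrived on.
    return [link for link in out_links if link != arrival_link]
```

Comparing before forwarding is what bounds the traffic here: each node forwards a given entry at most once per change.
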